A Template Based Hybrid Model for Chinese Personal Name Disambiguation

نویسندگان

  • Hao Zong
  • Derek F. Wong
  • Lidia S. Chao
چکیده

This paper proposes a template based hybrid model for Chinese Personal Name Disambiguation (CPND). The template makes use of the features of personal role such as discriminating personal name (nickname, stage name), together with the specific context of most frequent words, personal name nearest words named entities, date and time that are effective for this disambiguation task, as well as surrounding context of nominal, verbal and adjectival constituents. The construction of the templates is automatically derived from the articles that maximizes the deviation of different categories of personal names. The extraction algorithm of keyword features based on the distribution of unlabeled data is also proposed in this paper for this challenging task. In addition, an augmented similarity measure for the CPND model has been designed to calculate the similarity between a standard template and an unlabeled text. The final evaluation reveals that the proposed model can achieve the Fmeasure of 75.75% on the test data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Chinese Personal Name Disambiguation Based on Person Modeling

This document presents the bakeoff results of Chinese personal name in the First CIPS-SIGHAN Joint Conference on Chinese Language Processing. The authors introduce the frame of person disambiguation system LJPD, which uses a new person model. LJPD was built in short time, and it is not given enough training and adjustment. Evaluation on LJPD shows that the precision is competitive, but the reca...

متن کامل

The Chinese Persons Name Diambiguation Evaluation: Exploration of Personal Name Disambiguation in Chinese News

Personal name disambiguation becomes hot as it provides a way to incorporate semantic understanding into information retrieval. In this campaign, we explore Chinese personal name disambiguation in news. In order to examine how well disambiguation technologies work, we concentrate on news articles, which is well-formatted and whose genre is well-studied. We then design a diagnosis test to explor...

متن کامل

Chinese Personal Name Disambiguation Based on Vector Space Model

This paper introduces the task of Chinese personal name disambiguation of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP) 2012 that Natural Language Processing Laboratory of Zhengzhou University took part in. In this task, we mainly use the Vector Space Model to disambiguate Chinese personal name. We extract different named entity features from diverse names informa...

متن کامل

A Pipeline Approach to Chinese Personal Name Disambiguation

In this paper, we describe our system for Chinese personal name disambiguation task in the first CIPSSIGHAN joint conference on Chinese Language Processing(CLP2010). We use a pipeline approach, in which preprocessing, unrelated documents discarding, Chinese personal name extension and document clustering are performed separately. Chinese personal name extension is the most important part of the...

متن کامل

Chinese Personal Name Disambiguation: Technical Report of Natural Language Processing Lab of Xiamen University

This report presents the work of our group in the Chinese personal name disambiguation workshop. We propose a system which uses a HAC algorithm to cluster the mentions referring to the same person with features extracted from the documents.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012